Abstract

Consider a wireless MIMO multi-hop channel with $n_s$ non-cooperating source antennas and $n_d$ fully cooperating
destination antennas, as well as $L$ clusters containing $k$ non-cooperating relay antennas each. The source signal
traverses all $L$ clusters of relay antennas before it reaches the destination. When relay antennas within the same
cluster scale their received signals by the same constant before the retransmission, the equivalent channel matrix
$\mathbf{H}$ relating the input signals at the source antennas to the output signals at the destination antennas is
proportional to the product of channel matrices $\mathbf{H}_l$, $l = 1, \ldots, L+1$, corresponding to the individual
hops. We perform an asymptotic capacity analysis for this channel as follows: In a first instance we take the limits
$n_s \to \infty$, $n_d \to \infty$ and $k \to \infty$, but keep both $n_s/n_d$ and $k/n_d$ fixed. Then, we take the
limits $L \to \infty$ and $k/n_d \to \infty$. Requiring that the $\mathbf{H}_l$'s satisfy the conditions needed for the
Marcenko-Pastur law, we prove that the capacity scales linearly in $\min\{n_s, n_d\}$, as long as the ratio $k/n_d$
scales at least linearly in $L$. Moreover, we show that up to a noise penalty and a pre-log factor the capacity of a
point-to-point MIMO channel is approached, when this scaling is slightly faster than linear. Conversely, almost all
spatial degrees of freedom vanish for less than linear scaling.
I. Introduction

We consider coherent wireless multiple-input-multiple-output (MIMO) communication between $n_s$ non-cooperating
source antennas and $n_d$ fully cooperating destination antennas. In this paper it is assumed
that the source antennas are either far apart or shadowed from the destination antennas. The installation 
of intermediate nodes that relay the source signals to the destination (multi-hop) is well known for being 
an efficient means for improving the energy-efficiency of the communication system in this case. In the 
resulting network, the signals traverse L clusters containing k relay antennas each, before they reach 
the destination. Generally, signals transmitted by the source antennas might not only be received by the 
immediately succeeding cluster of relay antennas, but possibly also by clusters that are farther away or 
by the destination. While such receptions could well be exploited for achieving higher transmission rates, 
we assume them to be strongly attenuated and ignore them in this paper. 

In the most basic MIMO multi-hop network architecture, the relay antennas in the clusters do not 
cooperate. Since non-cooperative decoding of the interfering source signals at the individual relay antennas 
drastically reduces the achievable rate in the network, a simple amplify-and-forward operation becomes 
the relaying strategy of choice. That is, at each antenna a scaling of the received signals by a constant is 
performed before the retransmission. While this approach is cheap in terms of computational complexity, 
and also does not require any channel-state information at the relay nodes, it clearly suffers from noise 
accumulation. This basic network has been studied extensively by Borade, Zheng and Gallager for 
independent identically distributed (i.i.d.) Rayleigh fading channel matrices. In references [1], [2] they 
showed for $n = n_s = n_d = k$ that all $n$ spatial degrees of freedom are available in this network for a fixed
$L$ at high signal-to-noise ratio (SNR). More generally, they also showed that all degrees of freedom are
available, if $L$ as a function of the SNR fulfills $\lim_{\mathrm{snr}\to\infty} L(\mathrm{snr})/\log \mathrm{snr} = 0$.

While this result gives a design criterion for how the SNR should be increased with the number of hops
in the network, it does not give any insight into the eigenvalue distribution of the product of random
matrices $\mathbf{C}$ specifying the mutual information between input and output of the vector channel. For fixed
$L$, this eigenvalue distribution has only recently been characterized in the large antenna limit [3]. Based
on a theorem from large random matrix theory [4], the authors showed that it converges to a deterministic
function, and gave a recursive formula for the corresponding Stieltjes transform. Moreover, the reference
reports that the asymptotic eigenvalue distribution of $\mathbf{C}$, which is in fact the product of the signal
covariance matrix $\mathbf{R}_s$ and the inverse noise covariance matrix $\mathbf{R}_n^{-1}$ at the destination,
approaches the Marcenko-Pastur law in the large dimensions limit for $\beta_r = k/n_d \to \infty$, but
$\beta_s = n_s/n_d$ and $L$ fixed. Since the Marcenko-Pastur law is also the limiting eigenvalue distribution of the
classical point-to-point MIMO channel, this means that up to a noise penalty and a pre-log factor the
point-to-point capacity is approached in this case.

By considering the limiting eigenvalue distributions of the signal and noise covariance matrices separately,
we are able to generalize this result to the case $L \to \infty$ in this paper. In essence, we show that
$\beta_r$ needs to grow at least linearly with $L$ in order to sustain a non-zero fraction of the spatial degrees of
freedom in the system, i.e., linear capacity scaling in $\min\{n_s, n_d\}$. Moreover, when the scaling is faster
than linear, the limiting eigenvalue distribution of $\mathbf{C}$ is given by the Marcenko-Pastur law. That is, we are
able to exploit the spatial degrees of freedom without increasing the SNR at the receiver, at the expense of
employing more relay antennas. Returning to the result by Borade et al., where degrees of freedom are
sustained by increasing the SNR, according to our result the number of relays per layer can be seen as a
second resource besides the transmit power for compensating the capacity loss in the multi-hop network.

Another contribution of this paper lies in bridging the gap between the results obtained by Müller in
reference [5] on the one hand and by Morgenshtern and Bölcskei in references [6], [7] on the other hand.
In the first reference, it is shown in the large dimensions limit that almost all singular values of a product
of independent random matrices fulfilling the conditions needed for the Marcenko-Pastur law go to zero
as the number of multipliers grows large, while the aspect ratios of the matrices are kept finite. This
implies that almost all spatial degrees of freedom in a MIMO amplify-and-forward multi-hop network as
described above vanish as $L$ goes to infinity. On the other hand, [6], [7] were the first papers which proved
in the large dimensions limit (for $L = 1$) that the capacity of a point-to-point MIMO link is approached
up to a noise penalty and a pre-log factor, if $\beta_r \to \infty$ and $\beta_s$ is kept fixed. In [8] the same result had
been proven for the less general case that $n_s$ and $n_d$ are fixed and $k \to \infty$. The mechanisms discovered
in these papers apparently act as antipodal forces with respect to the limiting eigenvalue distributions of
products of random matrices. While increasing the number of hops distorts this distribution in an undesired
fashion, increasing the ratio between the number of relays and destination antennas allows for recovering
the original distribution corresponding to a point-to-point channel. In this paper, we answer the question
of how these two effects can be balanced, i.e., how fast $\beta_r$ must grow with $L$ in order to sustain a non-zero
fraction of spatial degrees of freedom as $L$ grows without bound.

II. Notation 

The superscripts $H$ and $*$ stand for conjugate transpose and complex conjugate, respectively. $\mathrm{E}_A[\cdot]$
denotes the expectation operator with respect to the random variable $A$. $\det(\mathbf{A})$, $\mathrm{Tr}(\mathbf{A})$
and $\lambda_i\{\mathbf{A}\}$ stand for determinant, trace and the $i$th eigenvalue of the matrix $\mathbf{A}$. $a(i)$ is
the $i$th element of the vector $\mathbf{a}$. Throughout the paper all logarithms, unless specified otherwise, are to the
base $e$. $\|\mathbf{a}\|$ denotes the Euclidean norm of the vector $\mathbf{a}$, $\|\mathbf{A}\|_{\mathrm{Tr}}$ the Trace
norm of the matrix $\mathbf{A}$. By $\Pr[A]$ we denote the probability of the event $A$.

Furthermore, we use the standard $O(\cdot)$, $\Omega(\cdot)$, $\Theta(\cdot)$ notations for characterizing the asymptotic
behavior of some function $f(\cdot)$ according to

$$f(n) \in O(g(n)) \ \text{ if } \ \exists M, n_0 > 0 : M|g(n)| > |f(n)|, \ \forall n > n_0,$$
$$f(n) \in \Omega(g(n)) \ \text{ if } \ \exists M, n_0 > 0 : M|g(n)| < |f(n)|, \ \forall n > n_0,$$
$$f(n) \in \Theta(g(n)) \ \text{ if } \ f(n) \in O(g(n)) \text{ and } f(n) \in \Omega(g(n)).$$

Finally, we define the function $1\{x\}$ to be 1 if $x$ is true and zero otherwise. $\delta(x)$ and $\sigma(x)$ denote
Dirac delta and Heaviside step function, respectively.
III. Transmission Protocol & System Model



We label the clusters of relay antennas by $\mathcal{C}_1, \ldots, \mathcal{C}_L$. Cluster $\mathcal{C}_1$ denotes the one
next to the destination, $\mathcal{C}_L$ the one next to the sources (refer to Fig. 1). We assume the $L+1$ single hop
channels between sources, relay clusters and destinations to be frequency-flat fading over the bandwidth of interest,
and divide the transmission into $L+1$ time slots. In time slot $T = 1$ the sources transmit to $\mathcal{C}_L$. The
transmission is described by the transmit vector $\mathbf{s} \in \mathbb{C}^{n_s}$, the matrix
$\mathbf{H}_{L+1} \in \mathbb{C}^{k \times n_s}$ representing the vector channel between sources and $\mathcal{C}_L$, the
vector $\mathbf{n}_L \in \mathbb{C}^{k}$ representing the receiver front-end noise introduced in $\mathcal{C}_L$, the
receive vector $\mathbf{y}_L \in \mathbb{C}^{k}$ and the linear mapping

$$\mathbf{y}_L = \mathbf{H}_{L+1}\mathbf{s} + \mathbf{n}_L.$$

Here and also in the subsequent equations, the $i$th elements of the transmit, receive and noise vectors
correspond to the $i$th antenna in the respective network stage.

The time slots $T \in \{2, \ldots, L\}$ are used for relaying the signals from cluster to cluster. In time slot $T$
the relay antennas in $\mathcal{C}_{L-T+2}$ transmit scaled versions of the signals received in time slot $T-1$ to
$\mathcal{C}_{L-T+1}$. That is, with $l = L - T + 2$ the transmit vector of the $l$th relay cluster
$\mathbf{r}_l \in \mathbb{C}^{k}$ is computed from the respective receive vector $\mathbf{y}_l \in \mathbb{C}^{k}$
according to

$$\mathbf{r}_l = \sqrt{\frac{\alpha_l}{k}}\,\mathbf{y}_l,$$

where $\alpha_l \in \mathbb{R}$ is a cluster specific constant of proportionality specifying the ratio between receive and
transmit power. The transmission in time slot $T$ is then described by

$$\mathbf{y}_{l-1} = \mathbf{H}_l\mathbf{r}_l + \mathbf{n}_{l-1}.$$

Here, $\mathbf{H}_l \in \mathbb{C}^{k \times k}$ represents the channel between $\mathcal{C}_l$ and $\mathcal{C}_{l-1}$,
$\mathbf{n}_{l-1} \in \mathbb{C}^{k}$ the receiver front-end noise introduced in $\mathcal{C}_{l-1}$, and
$\mathbf{y}_{l-1} \in \mathbb{C}^{k}$ is the corresponding receive vector. Thus, the signals traverse one
hop per time slot. In time slot $T = L+1$, $\mathcal{C}_1$ finally forwards its received signals to the destination.
Again, the transmit vector is computed according to $\mathbf{r}_1 = \sqrt{\alpha_1/k}\,\mathbf{y}_1$. Denoting the matrix
representing the channel between $\mathcal{C}_1$ and the destination by $\mathbf{H}_1 \in \mathbb{C}^{n_d \times k}$, the
vector representing the receiver front-end noise at the destination by $\mathbf{n}_d \in \mathbb{C}^{n_d}$ and the receive
vector by $\mathbf{y} \in \mathbb{C}^{n_d}$, this transmission is described by

$$\mathbf{y} = \mathbf{H}_1\mathbf{r}_1 + \mathbf{n}_d.$$

Putting everything together, the input-output relation of the channel as seen from source to destination
antennas over $L+1$ time slots can be written as

$$\mathbf{y} = \sqrt{\frac{\alpha_1\cdots\alpha_L}{k^L}}\,\mathbf{H}_1\cdots\mathbf{H}_{L+1}\mathbf{s} + \mathbf{n}_d + \sum_{l=1}^{L}\sqrt{\frac{\alpha_1\cdots\alpha_l}{k^l}}\,\mathbf{H}_1\cdots\mathbf{H}_l\mathbf{n}_l.$$



We model the entries of all noise vectors as zero-mean circularly symmetric complex Gaussian random
variables of unit-variance that are white both in space and time. The channel matrices are independent
and their elements are assumed to be i.i.d. zero-mean random variables of unit-variance. Moreover, we
impose the per antenna power constraints $\mathrm{E}[s(i)s(i)^*] = P/n_s$ and $\mathrm{E}[r_l(i)r_l(i)^*] = P/k$ for
$l \in \{1, \ldots, L\}$. The relay antenna power constraints are fulfilled, if the scaling factors $\alpha_l$ satisfy
$\alpha = \alpha_1 = \ldots = \alpha_L = P/(P+1)$.
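
The hop-by-hop protocol and the unrolled input-output relation of this section can be cross-checked numerically. The following is a minimal sketch in plain Python, assuming the relay scaling $\mathbf{r}_l = \sqrt{\alpha/k}\,\mathbf{y}_l$ with equal $\alpha$ in all clusters and $L \geq 1$; the function names (`hop_by_hop`, `unrolled`, and the helpers) are hypothetical and not part of the paper.

```python
import math
import random


def cn_vector(n, rng):
    """n i.i.d. zero-mean, unit-variance circularly symmetric complex Gaussians."""
    s = math.sqrt(0.5)
    return [complex(rng.gauss(0, s), rng.gauss(0, s)) for _ in range(n)]


def cn_matrix(rows, cols, rng):
    return [cn_vector(cols, rng) for _ in range(rows)]


def matvec(H, x):
    return [sum(h * v for h, v in zip(row, x)) for row in H]


def hop_by_hop(Hs, s, relay_noise, dest_noise, alpha, k):
    """Run the L+1 time slots; Hs = [H_1, ..., H_{L+1}], relay_noise = [n_1, ..., n_L]."""
    L = len(Hs) - 1
    gain = math.sqrt(alpha / k)                       # assumed scaling r_l = sqrt(alpha/k) y_l
    y = [a + b for a, b in zip(matvec(Hs[L], s), relay_noise[L - 1])]   # y_L
    for l in range(L, 1, -1):                         # clusters C_L, ..., C_2 forward
        r = [gain * v for v in y]
        y = [a + b for a, b in zip(matvec(Hs[l - 1], r), relay_noise[l - 2])]
    r1 = [gain * v for v in y]                        # C_1 forwards to the destination
    return [a + b for a, b in zip(matvec(Hs[0], r1), dest_noise)]


def unrolled(Hs, s, relay_noise, dest_noise, alpha, k):
    """Evaluate the closed-form input-output relation term by term."""
    L = len(Hs) - 1
    gain = math.sqrt(alpha / k)

    def through(v, l):
        """Apply H_1 ... H_l to the vector v."""
        for j in range(l, 0, -1):
            v = matvec(Hs[j - 1], v)
        return v

    y = [gain ** L * a + b for a, b in zip(through(s, L + 1), dest_noise)]
    for l in range(1, L + 1):
        y = [a + gain ** l * b for a, b in zip(y, through(relay_noise[l - 1], l))]
    return y
```

Both functions should agree up to floating-point error for any realization, which is a quick way to catch indexing mistakes in the cluster labeling ($\mathcal{C}_L$ next to the sources, $\mathcal{C}_1$ next to the destination).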

IV. Ergodic Capacity & Convergence of Eigenvalues 

While full cooperation and the presence of full channel-state information is assumed at the destination 
antennas, source and relay antennas do not cooperate and also do not possess any channel-state information. 
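Under these assumptions, the ergodic mutual information $I(\mathbf{s}; \mathbf{y})$ is maximized, when the entries of
$\mathbf{s}$ are zero-mean circularly symmetric complex Gaussian random variables of variance $P/n_s$ that are white over
both space and time [9]. For this input distribution, $I(\mathbf{s}; \mathbf{y})$ is fully characterized by the joint
probability distribution of the eigenvalues of the product $\mathbf{C} = \mathbf{R}_s\mathbf{R}_n^{-1}$, where
$\mathbf{R}_s \in \mathbb{C}^{n_d \times n_d}$ and $\mathbf{R}_n \in \mathbb{C}^{n_d \times n_d}$ denote the signal and noise
covariance matrices at the destination of the multi-hop channel. These covariance matrices are given by

$$\mathbf{R}_s = \frac{\alpha^L}{k^L}\,\mathrm{E}_{\mathbf{s}}\!\left[\mathbf{H}_1\cdots\mathbf{H}_{L+1}\mathbf{s}\mathbf{s}^H\mathbf{H}_{L+1}^H\cdots\mathbf{H}_1^H\right] = \frac{P\alpha^L}{n_s k^L}\,\mathbf{H}_1\cdots\mathbf{H}_{L+1}\mathbf{H}_{L+1}^H\cdots\mathbf{H}_1^H,$$

$$\mathbf{R}_n = \mathrm{E}_{\mathbf{n}_d,\mathbf{n}_1,\ldots,\mathbf{n}_L}\!\left[\left(\mathbf{n}_d + \sum_{l=1}^{L}\sqrt{\frac{\alpha^l}{k^l}}\,\mathbf{H}_1\cdots\mathbf{H}_l\mathbf{n}_l\right)\left(\mathbf{n}_d + \sum_{l=1}^{L}\sqrt{\frac{\alpha^l}{k^l}}\,\mathbf{H}_1\cdots\mathbf{H}_l\mathbf{n}_l\right)^{H}\right] = \mathbf{I}_{n_d} + \sum_{l=1}^{L}\left(\frac{\alpha}{k}\right)^{l}\mathbf{H}_1\cdots\mathbf{H}_l\mathbf{H}_l^H\cdots\mathbf{H}_1^H.$$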

We define the empirical eigenvalue distribution (EED) of the $n \times n$ matrix $\mathbf{A}$ as

$$F_{\mathbf{A}}^{(n)}(x) = \frac{1}{n}\sum_{i=1}^{n} 1\{\lambda_i\{\mathbf{A}\} < x\}. \tag{1}$$



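With this notation the ergodic capacity of the multi-hop channel in nats per channel use is obtained as

$$C = \frac{1}{L+1}\,\mathrm{E}_{\mathbf{C}}\!\left[\log\det\left(\mathbf{I}_{n_d} + \mathbf{C}\right)\right] = \frac{1}{L+1}\,\mathrm{E}_{\mathbf{C}}\!\left[\sum_{i=1}^{n_d}\log\left(1 + \lambda_i\{\mathbf{C}\}\right)\right] = \frac{n_d}{L+1}\,\mathrm{E}_{\mathbf{C}}\!\left[\int_0^{\infty}\log(1+x)\,dF_{\mathbf{C}}^{(n_d)}(x)\right]. \tag{2}$$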
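
The EED in (1) and the quantity inside the expectation in (2) are straightforward to evaluate once the eigenvalues of one realization of $\mathbf{C}$ are known. The following short sketch shows both under that assumption; the function names are hypothetical, and obtaining the eigenvalues themselves is left to a numerical linear algebra routine of the reader's choice.

```python
import math


def eed(eigenvalues, x):
    """Empirical eigenvalue distribution (1): fraction of eigenvalues strictly below x."""
    return sum(1 for lam in eigenvalues if lam < x) / len(eigenvalues)


def rate_from_eigenvalues(eigenvalues, L):
    """Per-realization argument of the expectation in (2), in nats per channel use."""
    return sum(math.log1p(lam) for lam in eigenvalues) / (L + 1)
```

Averaging `rate_from_eigenvalues` over many independent channel realizations would approximate the ergodic capacity $C$.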



Note that the pre-log factor $(L+1)^{-1}$ accounting for the use of $L+1$ time slots can be lowered by
initiating the next source antenna transmission after $\tilde{L} < L$ time slots. From a practical perspective
$\tilde{L}$ needs to be chosen large enough, such that the interference imposed on the previously transmitted signal
is negligible. It is important to have this fact in mind whenever we take the limit $L \to \infty$, which formally
drives the ergodic capacity to zero. In this paper we are interested in the scaling of the capacity in the
number of source and destination antennas. Accordingly, we focus on the case where both these quantities
grow large. From [3] we know that $F_{\mathbf{C}}^{(n_d)}(x)$ converges almost surely (a.s.) to some asymptotic
distribution $F_{\mathbf{C}}(x)$, when $n_s \to \infty$, $n_d \to \infty$ and $k \to \infty$, but the ratios
$\beta_s = n_s/n_d$ and $\beta_r = k/n_d$ are fixed. Here, we mean by the convergence of an EED
$F_{\mathbf{A}}^{(n)}(x)$ to some deterministic function $F_{\mathbf{A}}(x)$ that

$$\Pr\left[\lim_{n\to\infty}\sup_{x\in\mathbb{R}}\left|F_{\mathbf{A}}^{(n)}(x) - F_{\mathbf{A}}(x)\right| = 0\right] = 1.$$

We will refer to the density $f_{\mathbf{A}}(x) = \frac{d}{dx}F_{\mathbf{A}}(x)$ as the limiting spectral measure (LSM)
subsequently. Returning to the capacity expression (2), we can infer that for $n_s \to \infty$, $n_d \to \infty$ and
$k \to \infty$, and $\beta_s$ and $\beta_r$ fixed, $C$ normalized by $n_d$ converges to the quantity defined as

$$C_\infty = \frac{1}{L+1}\int_0^{\infty}\log(1+x)\,f_{\mathbf{C}}(x)\,dx.$$

V. Preliminaries From Random Matrix Theory 
We briefly repeat some preliminaries from the theory of large random matrices subsequently. 

A. Stieltjes Transform 

We define the Stieltjes transform of some LSM $f(\cdot)$ as

$$G(s) \triangleq \int_{-\infty}^{\infty}\frac{f(x)}{x+s}\,dx. \tag{3}$$



We stick to the definition in [5] here, while it is generally more common to define the Stieltjes transform 
with a minus sign in the denominator. The LSM is uniquely determined by its Stieltjes transform. While 
it is impossible to find a closed form expression for the LSM of a random matrix in many cases, implicit 
(polynomial) equations for the corresponding Stieltjes transform can sometimes be obtained. Accordingly, 
the Stieltjes transform plays a prominent role in large random matrix theory. A transform pair appearing 
again and again in the course of this paper is the following: 

$$f(x) = \delta(x - x_0) \quad\longleftrightarrow\quad G(s) = \frac{1}{s + x_0}. \tag{4}$$

B. Marcenko-Pastur Law 

The result presented in this paper is valid for the class of random matrices fulfilling the conditions for
the Marcenko-Pastur law [10], [11], which we briefly repeat here. Let $\mathbf{X} \in \mathbb{C}^{k_0 \times k_1}$ be a
random matrix whose entries are i.i.d. zero-mean distributed and of unit-variance. If both $k_0 \to \infty$ and
$k_1 \to \infty$, but $\beta = k_1/k_0$ is kept finite, then the Stieltjes transform of the LSM of
$\frac{1}{k_1}\mathbf{X}\mathbf{X}^H$ is given by

$$G_{\mathrm{MP}}(s) = \sqrt{\frac{(\beta s + \beta - 1)^2}{4s^2} + \frac{\beta}{s}} - \frac{\beta s + \beta - 1}{2s}. \tag{5}$$

The corresponding LSM can be written in closed form. 
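
For numerical work the transform (5) can be evaluated directly. The sketch below assumes the sign convention of (3), $s > 0$, and $\beta \geq 1$ (so that the LSM has no point mass at zero); it uses an algebraically equivalent form of the positive root that avoids cancellation for large $\beta$. The density `mp_density` is the standard closed-form Marcenko-Pastur density, which is not spelled out in the text and is included here only as a cross-check. All names are hypothetical.

```python
import math


def g_mp(s, beta):
    """Stieltjes transform of the Marcenko-Pastur law, convention G(s) = int f(x)/(x+s) dx.

    Positive root of s*G^2 + (beta*s + beta - 1)*G - beta = 0, written in a
    cancellation-free form.
    """
    b = beta * s + beta - 1.0
    return 2.0 * beta / (b + math.sqrt(b * b + 4.0 * beta * s))


def mp_density(x, beta):
    """Standard Marcenko-Pastur density for (1/k1) X X^H with beta = k1/k0 >= 1."""
    c = 1.0 / beta
    lo, hi = (1.0 - math.sqrt(c)) ** 2, (1.0 + math.sqrt(c)) ** 2
    if x <= lo or x >= hi:
        return 0.0
    return math.sqrt((hi - x) * (x - lo)) / (2.0 * math.pi * c * x)
```

As $\beta \to \infty$ the value of `g_mp(s, beta)` approaches $1/(s+1)$, the transform of a Dirac delta at one according to (4).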



C. Concatenated Vector Channel 

Our result is based on a theorem on the concatenated vector channel proven in [5] using the S-transform
[12], [13]:

Theorem 1. Let $\mathbf{M}_1 \in \mathbb{C}^{k_0 \times k_1}, \ldots, \mathbf{M}_N \in \mathbb{C}^{k_{N-1} \times k_N}$ be
independent random matrices fulfilling the conditions for the Marcenko-Pastur law, whose elements are of unit-variance,
and define $\beta_n = k_n/k_0$. Then the Stieltjes transform of the LSM corresponding to
$\frac{1}{k_1\cdots k_N}\mathbf{M}_1\cdots\mathbf{M}_N\mathbf{M}_N^H\cdots\mathbf{M}_1^H$ fulfills the implicit equation

$$G(s)\prod_{n=1}^{N}\frac{sG(s) - 1 + \beta_n}{\beta_n} + sG(s) = 1. \tag{6}$$

Note that we normalize with respect to $k_N$ rather than with respect to $k_0$ as done in [5]. The Stieltjes
transform $\tilde{G}(s)$ therein relates to $G(s)$ as $G(s) = \beta_N\tilde{G}(\beta_N s)$.

VI. Capacity Scaling 

We formalize our result in the subsequent theorem. It is important to note that taking the limits in the
LSM of a random matrix means that the dimensions of this matrix are already taken to infinity. For the
case at hand that is, we first take the limits $n_s \to \infty$, $n_d \to \infty$ and $k \to \infty$, and then take the
limits $L \to \infty$ and $\beta_r \to \infty$. Whenever we take the limits $L \to \infty$ and $\beta_r \to \infty$ in an LSM
of some random matrix $\mathbf{A}$ or in the corresponding Stieltjes transform, we denote the asymptotic expressions by

$$f_{\mathbf{A}}^{(\infty)}(x) = \lim_{L,\beta_r\to\infty} f_{\mathbf{A}}(x) \quad\text{and}\quad G_{\mathbf{A}}^{(\infty)}(s) = \lim_{L,\beta_r\to\infty} G_{\mathbf{A}}(s).$$

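Theorem 2. Let $\mathbf{H}_1 \in \mathbb{C}^{n_d \times k}$, $\mathbf{H}_2, \ldots, \mathbf{H}_L \in \mathbb{C}^{k \times k}$
and $\mathbf{H}_{L+1} \in \mathbb{C}^{k \times n_s}$ be independent random matrices with elements of unit-variance fulfilling
the conditions needed for the Marcenko-Pastur law and define $\beta_s = n_s/n_d$ and $\beta_r = k/n_d$. Let
$\mathrm{snr}$ be a positive constant. Then, the Stieltjes transform corresponding to the LSM of

$$\mathbf{C} = P\alpha^L\cdot\mathbf{R}_s\cdot\mathbf{R}_n^{-1} = P\alpha^L\cdot\mathbf{R}_s\cdot\left(\sum_{l=0}^{L}\alpha^l\,\mathbf{R}_{n,l}\right)^{-1},$$

with

$$P = \frac{1-\alpha^{L+1}}{(1-\alpha)\cdot\alpha^{L}}\cdot\mathrm{snr}, \qquad \alpha = \frac{P}{1+P},$$

$$\mathbf{R}_s = \frac{1}{n_s k^L}\,\mathbf{H}_1\cdots\mathbf{H}_{L+1}\mathbf{H}_{L+1}^H\cdots\mathbf{H}_1^H, \qquad \mathbf{R}_{n,l} = \begin{cases}\mathbf{I}_{n_d}, & \text{if } l = 0,\\[2pt] \frac{1}{k^l}\,\mathbf{H}_1\cdots\mathbf{H}_l\mathbf{H}_l^H\cdots\mathbf{H}_1^H, & \text{else},\end{cases}$$

in the limits $\beta_r \to \infty$, $L \to \infty$, $\beta_s$ fixed, converges to

$$G_{\mathbf{C}}^{(\infty)}(s) = \begin{cases}\frac{1}{\mathrm{snr}}\,G_{\mathrm{MP}}\!\left(\frac{s}{\mathrm{snr}}\right), & \text{if } L \in O(\beta_r^{1-\varepsilon}),\\[2pt] s^{-1}, & \text{if } L \in \Omega(\beta_r^{1+\varepsilon}),\end{cases} \qquad \varepsilon > 0. \tag{7}$$

Furthermore, if $L \in \Theta(\beta_r)$ the Stieltjes transform converges to some $G_{\mathbf{C}}^{(\infty)}(s) \neq s^{-1}$
in this limit.

Here and in the remainder of the paper, $\mathbf{R}_s$ denotes the signal covariance matrix of Section IV normalized by
$P\alpha^L$.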



We have introduced the parameters $P$ and $\mathrm{snr}$ such that they correspond to the average transmit power
at the source and the average SNR at the destination. The case
$G_{\mathbf{C}}^{(\infty)}(s) = \frac{1}{\mathrm{snr}}G_{\mathrm{MP}}\!\left(\frac{s}{\mathrm{snr}}\right)$ corresponds to a
point-to-point MIMO channel scenario and thus generalizes [7] and [3] in the sense that it gives a condition
on how fast the number of relays per layer needs to grow with the number of hops in order to approach
the Marcenko-Pastur law for increasing $L$. The case $G_{\mathbf{C}}^{(\infty)}(s) = s^{-1}$ can be seen as a
generalization of Theorem 4 in [5], which states that almost all eigenvalues vanish, i.e.,
$f_{\mathbf{C}}^{(\infty)}(x) = \delta(x)$ a.s., if the aspect ratios $\beta_r$ remain finite. The case of linear scaling of
$\beta_r$ in $L$ constitutes the threshold between the previous two regimes, but still suffices in order to sustain a
non-zero fraction of the spatial degrees of freedom. In summary, the theorem thus states that the capacity scales
linearly in $\min\{n_s, n_d\}$ as long as $k$ scales at least linearly in both $L$ and $\min\{n_s, n_d\}$ and the SNR at
the destination is kept constant.
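
The two relations tying $P$, $\alpha$ and $\mathrm{snr}$ together in Theorem 2 can be solved in closed form. The following rearrangement is not stated in this form in the text; it is a short worked step that follows from $P = \frac{1-\alpha^{L+1}}{(1-\alpha)\alpha^L}\,\mathrm{snr}$ and $\alpha = P/(1+P)$ alone.

```latex
% Worked rearrangement (our own step, derived only from the two relations in Theorem 2).
\begin{align*}
\alpha = \frac{P}{1+P} \quad &\Longrightarrow \quad P\,(1-\alpha) = \alpha, \\
\mathrm{snr} = \frac{P\,(1-\alpha)\,\alpha^{L}}{1-\alpha^{L+1}}
  \quad &\Longrightarrow \quad \mathrm{snr} = \frac{\alpha^{L+1}}{1-\alpha^{L+1}}, \\
\alpha^{L+1} = \frac{\mathrm{snr}}{1+\mathrm{snr}},
  \qquad & \qquad P = \frac{\alpha}{1-\alpha}.
\end{align*}
```

For $L = 0$ this reduces to $P = \mathrm{snr}$. For growing $L$ at fixed $\mathrm{snr}$, $\alpha^{L+1}$ stays constant, so $\alpha \to 1$ and $P$ grows roughly like $(L+1)/\log(1 + 1/\mathrm{snr})$: holding the destination SNR constant requires a transmit power that increases with the number of hops.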

VII. Asymptotic Analysis 

This section provides the proof of Theorem 2. We start out with the lemma below, which will allow for
inferring a corollary to Theorem 1. This corollary is the key to the proof of Theorem 2 in this section. In
order not to interrupt the logical flow of the section, two rather technical lemmas required for the proof
of Theorem 2 are stated and proven in the Appendix of this paper.



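Lemma 1. For any $\varepsilon > 0$, some function $g: \mathbb{R} \to \mathbb{R}_+$, and a positive constant $c$

$$\lim_{\kappa\to\infty}\left(\frac{c}{\kappa} + 1\right)^{g(\kappa)} = \begin{cases}1, & \text{if } g(\kappa) \in O(\kappa^{1-\varepsilon}),\\ \infty, & \text{if } g(\kappa) \in \Omega(\kappa^{1+\varepsilon}).\end{cases}$$

Furthermore, if $g(\kappa) \in \Theta(\kappa)$ there exist constants $M_2 \geq M_1 > 0$, such that

$$e^{cM_1} \leq \liminf_{\kappa\to\infty}\left(\frac{c}{\kappa} + 1\right)^{g(\kappa)} \leq \limsup_{\kappa\to\infty}\left(\frac{c}{\kappa} + 1\right)^{g(\kappa)} \leq e^{cM_2}.$$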



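Proof. The lemma follows from the fact that the limit can be taken inside a continuous function, which
allows us to write

$$\lim_{\kappa\to\infty}\left(\frac{c}{\kappa} + 1\right)^{g(\kappa)} = \exp\left(\lim_{\kappa\to\infty} g(\kappa)\cdot\log\left(\frac{c}{\kappa} + 1\right)\right), \tag{8}$$

and the rule of Bernoulli-l'Hospital applied to $g(\kappa) = M\kappa^{\gamma}$, where $M$ and $\gamma$ are positive constants:

$$\lim_{\kappa\to\infty} M\kappa^{\gamma}\cdot\log\left(\frac{c}{\kappa} + 1\right) = \lim_{\kappa\to\infty}\frac{cM\kappa^{\gamma}}{\gamma\,(c + \kappa)}. \tag{9}$$

If $g(\kappa) \in O(\kappa^{1-\varepsilon})$, by definition there exists some $M > 0$, such that the exponent in (8) can be
upper-bounded by

$$\lim_{\kappa\to\infty} g(\kappa)\cdot\log\left(\frac{c}{\kappa} + 1\right) \leq \lim_{\kappa\to\infty} M\kappa^{1-\varepsilon}\cdot\log\left(\frac{c}{\kappa} + 1\right). \tag{10}$$

Evaluating (9) for $\gamma = 1 - \varepsilon$ renders this upper bound zero, which establishes that also the left hand side
(LHS) of (10) becomes zero and (8) evaluates to one in this case.

Analogously, if $g(\kappa) \in \Omega(\kappa^{1+\varepsilon})$, there exists some $M > 0$, such that the exponent in (8) can
be lower-bounded according to

$$\lim_{\kappa\to\infty} g(\kappa)\cdot\log\left(\frac{c}{\kappa} + 1\right) \geq \lim_{\kappa\to\infty} M\kappa^{1+\varepsilon}\cdot\log\left(\frac{c}{\kappa} + 1\right). \tag{11}$$

Evaluating (9) for $\gamma = 1 + \varepsilon$ renders the lower bound infinite, which establishes that both the LHS of
(11) and (8) grow without bound in this case.

Finally, if $g(\kappa) \in \Theta(\kappa)$, there exist constants $M_1$ and $M_2$, fulfilling $M_2 \geq M_1 > 0$, such that
according to (9) evaluated for $\gamma = 1$ the exponent in (8) is sandwiched between

$$cM_1 = \lim_{\kappa\to\infty} M_1\kappa\cdot\log\left(\frac{c}{\kappa} + 1\right) \leq \liminf_{\kappa\to\infty} g(\kappa)\cdot\log\left(\frac{c}{\kappa} + 1\right) \leq \limsup_{\kappa\to\infty} g(\kappa)\cdot\log\left(\frac{c}{\kappa} + 1\right) \leq \lim_{\kappa\to\infty} M_2\kappa\cdot\log\left(\frac{c}{\kappa} + 1\right) = cM_2,$$

which establishes the second part of the lemma. ■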
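
The three regimes of Lemma 1 are easy to observe numerically. The sketch below evaluates $(c/\kappa + 1)^{g(\kappa)}$ through the exponential form (8); the function name and the particular choices of $g$ in the usage note are illustrative assumptions.

```python
import math


def lemma1_term(c, kappa, g):
    """Evaluate (c/kappa + 1) ** g(kappa) as exp(g(kappa) * log(1 + c/kappa)), cf. (8)."""
    return math.exp(g(kappa) * math.log1p(c / kappa))
```

With $c = 1$, one would expect `lemma1_term(1.0, 1e8, lambda k: k ** 0.5)` to be close to one, `lemma1_term(1.0, 1e8, lambda k: 2 * k)` to be close to $e^{2}$, and `lemma1_term(1.0, 1e4, lambda k: k ** 1.5)` to be astronomically large, in line with the sub-linear, linear and super-linear cases of the lemma.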

Corollary 1 of Theorem 1. With the notation and assumptions from Theorem 2 the Stieltjes transform
of the LSM corresponding to $\mathbf{R}_s$ in the limit $k \to \infty$ and $L \to \infty$ converges to

$$G_{\mathbf{R}_s}^{(\infty)}(s) = \begin{cases}G_{\mathrm{MP}}(s), & \text{if } L \in O(\beta_r^{1-\varepsilon}),\\ s^{-1}, & \text{if } L \in \Omega(\beta_r^{1+\varepsilon}).\end{cases} \tag{12}$$

Also, if $L \in \Theta(\beta_r)$ the LSM of $\mathbf{R}_s$ converges to a distribution corresponding to a Stieltjes transform
$G_{\mathbf{R}_s}^{(\infty)}(s) \neq s^{-1}$ in this limit.

Furthermore, if $L \in O(\beta_r^{1-\varepsilon})$ the Stieltjes transforms of the LSMs corresponding to the
$\mathbf{R}_{n,l}$'s, for $l \in \{1, \ldots, L\}$, in the limit $k \to \infty$ and $L \to \infty$ converge to

$$G_{\mathbf{R}_{n,l}}^{(\infty)}(s) = (s+1)^{-1}. \tag{13}$$

The part of the corollary related to the $\mathbf{R}_{n,l}$'s is actually stated more generally than needed in this
paper. In fact, we are only going to use that for some fixed positive integer $L_t < L$, the LSMs of the
$\mathbf{R}_{n,l}$'s, where $l \in \{1, \ldots, L_t\}$, are given by a Dirac delta at one, i.e., have the Stieltjes
transform (13), when $\beta_r \to \infty$, independently of the scaling of $\beta_r$ in $L$. This is trivially guaranteed by
the corollary, since when $L_t$ is constant, $L_t \in O(\beta_r^{1-\varepsilon})$ is fulfilled naturally as
$\beta_r \to \infty$.
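
The behavior described by the corollary can be explored numerically by solving the implicit equation of Theorem 1 for the relay chain, i.e., with $N = L+1$, $\beta_n = \beta_r$ for $n \leq L$ and $\beta_N = \beta_s$. The sketch below is a hedged illustration, not the authors' procedure: it assumes $s > 0$, $\beta_s \geq 1$ and $\beta_r \geq 1$, in which case the left-hand side is increasing in $G$ on $(0, 1/s)$ and plain bisection finds the unique root in that interval. The function name is hypothetical.

```python
def stieltjes_rs(s, beta_s, beta_r, L, iters=200):
    """Root G in (0, 1/s) of
        G * (sG - 1 + beta_s)/beta_s * ((sG - 1 + beta_r)/beta_r)**L + sG = 1,
    found by bisection (assumes s > 0, beta_s >= 1, beta_r >= 1)."""

    def residual(G):
        z = s * G - 1.0
        return G * (z + beta_s) / beta_s * ((z + beta_r) / beta_r) ** L + s * G - 1.0

    lo, hi = 0.0, 1.0 / s          # residual(lo) = -1 < 0 and residual(hi) = 1/s > 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Sweeping $L$ at fixed $\beta_r$ should push $sG(s)$ toward one, which corresponds to the spectral mass concentrating at zero, whereas choosing $\beta_r$ much larger than $L$ should keep the solution close to the Marcenko-Pastur transform with $\beta = \beta_s$.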

Proof of Corollary 1. We treat (12) first. An implicit equation for the Stieltjes transform of the LSM
corresponding to $\mathbf{R}_s$ is given by (6), where we set $N = L+1$, $\beta_0 = 1$, $\beta_n = \beta_r$ for
$n \in \{1, \ldots, N-1\}$, and $\beta_N = \beta_s$ according to our notation:

$$G_{\mathbf{R}_s}(s)\cdot\frac{sG_{\mathbf{R}_s}(s) - 1 + \beta_s}{\beta_s}\cdot\underbrace{\left(\frac{sG_{\mathbf{R}_s}(s) - 1 + \beta_r}{\beta_r}\right)^{L}}_{\Psi_s(\beta_r, L)} + sG_{\mathbf{R}_s}(s) = 1. \tag{14}$$

We apply Lemma 1 to $\Psi_s(\beta_r, L)$, where we identify $\beta_r$ with $\kappa$ and $L$ with $g(\kappa)$. In the limits
$\beta_r \to \infty$, $L \to \infty$ and $L \in O(\beta_r^{1-\varepsilon})$ this yields $\Psi_s(\beta_r, L) \to 1$.
Accordingly, (14) simplifies to the quadratic equation

$$\beta_s^{-1}s\left(G_{\mathbf{R}_s}^{(\infty)}(s)\right)^2 + \left(s + 1 - \beta_s^{-1}\right)G_{\mathbf{R}_s}^{(\infty)}(s) = 1 \tag{15}$$

in this limit. The solution to (15) is the Stieltjes transform of the Marcenko-Pastur law (5) with $\beta = \beta_s$. If
$L \in \Omega(\beta_r^{1+\varepsilon})$ we know from Lemma 1 that $\Psi_s(\beta_r, L)$ grows without bound for
$L \to \infty$. The number of factors in the LHS of (6) grows with $L$ in this case. Theorem 4 in reference [5] states
that for $N \to \infty$ the Stieltjes transform in (6) converges to $G^{(\infty)}(s) = s^{-1}$, if the $\beta_n$ are
uniformly bounded. In fact, the conditions needed for this theorem when $\beta = \beta_1 = \ldots = \beta_{N-1}$ can be
relaxed to $N \in \Omega(\beta^{1+\varepsilon})$, while the proof in [5] remains valid. Accordingly, the second case in
(12) follows immediately.

For the case $L \in \Theta(\beta_r)$ we know from Lemma 1 that
$\Psi_s(\beta_r, L) \to e^{d\,(sG_{\mathbf{R}_s}^{(\infty)}(s) - 1)}$ in the limit of interest for some $d > 0$. Thus (14)
simplifies to

$$G_{\mathbf{R}_s}^{(\infty)}(s)\left(\beta_s^{-1}\left(sG_{\mathbf{R}_s}^{(\infty)}(s) - 1\right) + 1\right)e^{d\,(sG_{\mathbf{R}_s}^{(\infty)}(s) - 1)} + sG_{\mathbf{R}_s}^{(\infty)}(s) = 1. \tag{16}$$

There exists no closed form solution to this implicit equation. However, it is easily verified that
$G_{\mathbf{R}_s}^{(\infty)}(s) = s^{-1}$ does not satisfy (16).

For (13), we obtain the equation for the Stieltjes transform corresponding to the LSM of $\mathbf{R}_{n,l}$, where
$l \in \{1, \ldots, L\}$, from (6) with $N = l$, $\beta_0 = 1$ and $\beta_n = \beta_r$ for $n \in \{1, \ldots, l\}$ as

$$G_{\mathbf{R}_{n,l}}(s)\cdot\underbrace{\left(\frac{sG_{\mathbf{R}_{n,l}}(s) - 1 + \beta_r}{\beta_r}\right)^{l}}_{\Psi_n(\beta_r, l)} + sG_{\mathbf{R}_{n,l}}(s) = 1. \tag{17}$$

Let us consider the case $l = L$ first. If $L \in O(\beta_r^{1-\varepsilon})$, $\Psi_n(\beta_r, l)$ converges to one in the
limit $k \to \infty$ and $L \to \infty$ by Lemma 1. Therefore, in this limit (17) simplifies to

$$G_{\mathbf{R}_{n,l}}^{(\infty)}(s) + sG_{\mathbf{R}_{n,l}}^{(\infty)}(s) = 1. \tag{18}$$

The solution to (18) is given by $G_{\mathbf{R}_{n,l}}^{(\infty)}(s) = (s+1)^{-1}$. The same is trivially true for all
$\mathbf{R}_{n,l}$ with $l < L$, since whenever $L \in O(\beta_r^{1-\varepsilon})$, this implies that also
$l \in O(\beta_r^{1-\varepsilon})$. ■

We are now ready to prove Theorem 2. Besides the above corollary, we use the Lemmas 2 & 3, which 
are stated and proven in the Appendix of this paper. 

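Proof of Theorem 2. We go through the different scaling behaviors of $\beta_r$ with $L$, subsequently.

Cases $L \in O(\beta_r^{1-\varepsilon})$ and $L \in \Theta(\beta_r)$: Firstly, we show for both these cases that the LSM
of $\mathbf{R}_n$ takes on the shape of a Dirac delta at some positive constant in the limit of interest. The proof is
based on a truncation of the relay chain between the stages $L_t$ and $L_t - 1$. By choosing $L_t$ large enough, we
can achieve that the accumulated noise power originating from the relay stages $L_t, \ldots, L$ is sufficiently
attenuated before it reaches the destination. More specifically, we claim that for any $\varepsilon > 0$ there exist
positive integers $L_t^{(1)}$ and $n_0^{(1)}(L, L_t)$ (see footnote 1), such that for all $n > n_0^{(1)}(L, L_t)$, for all
$L_t > L_t^{(1)}$ and $L$ arbitrarily large a.s. (see footnote 2)

$$\frac{1}{n_d}\left\|\sum_{l=L_t}^{L}\left(\frac{\alpha}{k}\right)^{l}\mathbf{H}_1\cdots\mathbf{H}_l\mathbf{H}_l^H\cdots\mathbf{H}_1^H\right\|_{\mathrm{Tr}} < \frac{\varepsilon}{3}. \tag{19}$$

We prove this by the following chain of inequalities:

$$\begin{aligned}
\frac{1}{n_d}\left\|\sum_{l=L_t}^{L}\left(\frac{\alpha}{k}\right)^{l}\mathbf{H}_1\cdots\mathbf{H}_l\mathbf{H}_l^H\cdots\mathbf{H}_1^H\right\|_{\mathrm{Tr}}
&\leq \frac{1}{n_d}\sum_{l=L_t}^{L}\alpha^{l}\left\|\frac{1}{k^{l}}\mathbf{H}_1\cdots\mathbf{H}_l\mathbf{H}_l^H\cdots\mathbf{H}_1^H\right\|_{\mathrm{Tr}} \\
&\leq \max_{l'=L_t,\ldots,L}\left\{\frac{1}{n_d}\left\|\frac{1}{k^{l'}}\mathbf{H}_1\cdots\mathbf{H}_{l'}\mathbf{H}_{l'}^H\cdots\mathbf{H}_1^H\right\|_{\mathrm{Tr}}\right\}\cdot\sum_{l=L_t}^{L}\alpha^{l} \\
&< \max_{l'=L_t,\ldots,L}\left\{\frac{1}{n_d}\left\|\frac{1}{k^{l'}}\mathbf{H}_1\cdots\mathbf{H}_{l'}\mathbf{H}_{l'}^H\cdots\mathbf{H}_1^H\right\|_{\mathrm{Tr}}\right\}\cdot\sum_{l=L_t}^{\infty}\alpha^{l} \\
&= \max_{l'=L_t,\ldots,L}\left\{\frac{1}{n_d}\left\|\frac{1}{k^{l'}}\mathbf{H}_1\cdots\mathbf{H}_{l'}\mathbf{H}_{l'}^H\cdots\mathbf{H}_1^H\right\|_{\mathrm{Tr}}\right\}\cdot\frac{\alpha^{L_t}}{1-\alpha}.
\end{aligned} \tag{20}$$

In the first step we applied the triangle inequality and used the homogeneity of the Trace norm. In the
second step we upper-bounded the coefficients of the $\alpha^l$'s by the maximum coefficient. Afterwards, we
let the number of summands go to infinity, which strictly increases the term, since all added summands
are positive. A standard identity for geometric series allows for eliminating the sum. All arguments of
the $\max\{\cdot\}$ function in (20) converge to one a.s. This follows immediately from Theorem 3 in [5].
Therefore, we can choose an $n_0^{(1)}(L, L_t)$ large enough, such that even the maximum of the $L - L_t + 1$
terms is arbitrarily close to one a.s. for all $n > n_0^{(1)}(L, L_t)$. In particular, if
$L_t > \log_\alpha((1-\alpha)\cdot\varepsilon/3)$, we can make $n_0^{(1)}(L, L_t)$ large enough, such that (19) is
fulfilled a.s. for all $n > n_0^{(1)}(L, L_t)$.

Footnote 1: We write $n_0(L, L_t)$ in order to emphasize that $n_0(\cdot)$ is a function of $L$ and $L_t$.
Footnote 2: The Trace norm of the matrix $\mathbf{A} \in \mathbb{C}^{n \times n}$ is defined as
$\|\mathbf{A}\|_{\mathrm{Tr}} = \sum_{i=1}^{n}|\lambda_i\{\mathbf{A}\}|$.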
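
The truncation condition $L_t > \log_\alpha((1-\alpha)\cdot\varepsilon/3)$ used above is equivalent to requiring that the geometric tail $\alpha^{L_t}/(1-\alpha)$ stays below $\varepsilon/3$. A minimal sketch for computing the smallest admissible integer, assuming $0 < \alpha < 1$ and $\varepsilon > 0$ (the function name is hypothetical):

```python
import math


def min_truncation_stage(alpha, eps):
    """Smallest integer L_t >= 0 with alpha**L_t / (1 - alpha) < eps / 3, for 0 < alpha < 1."""
    bound = math.log((1.0 - alpha) * eps / 3.0) / math.log(alpha)   # log_alpha((1-alpha)*eps/3)
    return max(0, math.floor(bound) + 1)
```

For example, with $\alpha = 0.5$ and $\varepsilon = 0.3$ the bound is $\log_{0.5}(0.05) \approx 4.32$, so five stages suffice. Since $\alpha$ approaches one as the transmit power grows, the required truncation stage increases accordingly.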

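In a next step we can choose $L$ (and thus $\beta_r$) large enough such that the accumulated noise power
originating from the relays $1, \ldots, L_t - 1$ becomes sufficiently white. This means, for the fixed $L_t$ and
the same $\varepsilon$ as defined above there exist $L_0 > L_t$ and $n_0^{(2)}(L, L_t)$, such that a.s. for all
$L > L_0$ and for all $n > n_0^{(2)}(L, L_t)$

$$\frac{1}{n_d}\left\|\frac{1-\alpha^{L_t}}{1-\alpha}\,\mathbf{I}_{n_d} - \sum_{l=0}^{L_t-1}\left(\frac{\alpha}{k}\right)^{l}\mathbf{H}_1\cdots\mathbf{H}_l\mathbf{H}_l^H\cdots\mathbf{H}_1^H\right\|_{\mathrm{Tr}} < \frac{\varepsilon}{3}.$$

The proof is similar to the one above:

$$\begin{aligned}
\frac{1}{n_d}\left\|\frac{1-\alpha^{L_t}}{1-\alpha}\,\mathbf{I}_{n_d} - \sum_{l=0}^{L_t-1}\left(\frac{\alpha}{k}\right)^{l}\mathbf{H}_1\cdots\mathbf{H}_l\mathbf{H}_l^H\cdots\mathbf{H}_1^H\right\|_{\mathrm{Tr}}
&= \frac{1}{n_d}\left\|\sum_{l=0}^{L_t-1}\alpha^{l}\left(\mathbf{I}_{n_d} - \frac{1}{k^{l}}\mathbf{H}_1\cdots\mathbf{H}_l\mathbf{H}_l^H\cdots\mathbf{H}_1^H\right)\right\|_{\mathrm{Tr}} \\
&\leq \frac{1}{n_d}\sum_{l=0}^{L_t-1}\alpha^{l}\left\|\mathbf{I}_{n_d} - \frac{1}{k^{l}}\mathbf{H}_1\cdots\mathbf{H}_l\mathbf{H}_l^H\cdots\mathbf{H}_1^H\right\|_{\mathrm{Tr}} \\
&\leq \max_{l'\in\{1,\ldots,L_t-1\}}\left\{\frac{1}{n_d}\left\|\mathbf{I}_{n_d} - \frac{1}{k^{l'}}\mathbf{H}_1\cdots\mathbf{H}_{l'}\mathbf{H}_{l'}^H\cdots\mathbf{H}_1^H\right\|_{\mathrm{Tr}}\right\}\cdot\sum_{l=0}^{L_t-1}\alpha^{l} \\
&= \max_{l'\in\{1,\ldots,L_t-1\}}\left\{\frac{1}{n_d}\left\|\mathbf{I}_{n_d} - \frac{1}{k^{l'}}\mathbf{H}_1\cdots\mathbf{H}_{l'}\mathbf{H}_{l'}^H\cdots\mathbf{H}_1^H\right\|_{\mathrm{Tr}}\right\}\cdot\frac{1-\alpha^{L_t}}{1-\alpha}.
\end{aligned} \tag{21}$$

Again, the first identity is a standard identity for a geometric series. In this case the convergence of
the arguments of the $\max\{\cdot\}$ function is guaranteed by Corollary 1 and Lemma 2 (refer to Appendix):
Corollary 1 tells us that the LSMs of all the $\frac{1}{k^{l}}\mathbf{H}_1\cdots\mathbf{H}_l\mathbf{H}_l^H\cdots\mathbf{H}_1^H$'s,
where $l \in \{1, \ldots, L_t - 1\}$, converge to a Dirac at one in the limit under consideration. Knowing this, Lemma 2
guarantees us that the respective Trace norms go to zero. Thus, there exist $L_0$ and $n_0^{(2)}(L, L_t)$, such that
a.s. the maximum of the $L_t - 1$ terms is small enough to make (21) smaller than $\varepsilon/3$ for all $L > L_0$ and
$n > n_0^{(2)}(L, L_t)$.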
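With the choices $L_t = \max\{L_t^{(1)}, \log_\alpha((1-\alpha)\cdot\varepsilon/3)\}$ and
$n_0(L, L_t) = \max\{n_0^{(1)}(L, L_t), n_0^{(2)}(L, L_t)\}$, we can finally conclude by the triangle inequality that
for all $L > L_0$ and $n > n_0(L, L_t)$ a.s.

$$\begin{aligned}
\frac{1}{n_d}\left\|\frac{1-\alpha^{L+1}}{1-\alpha}\,\mathbf{I}_{n_d} - \sum_{l=0}^{L}\alpha^{l}\mathbf{R}_{n,l}\right\|_{\mathrm{Tr}}
&= \frac{1}{n_d}\left\|\frac{1-\alpha^{L_t}}{1-\alpha}\,\mathbf{I}_{n_d} + \frac{\alpha^{L_t}-\alpha^{L+1}}{1-\alpha}\,\mathbf{I}_{n_d} - \sum_{l=0}^{L_t-1}\alpha^{l}\mathbf{R}_{n,l} - \sum_{l=L_t}^{L}\alpha^{l}\mathbf{R}_{n,l}\right\|_{\mathrm{Tr}} \\
&\leq \frac{1}{n_d}\left\|\frac{1-\alpha^{L_t}}{1-\alpha}\,\mathbf{I}_{n_d} - \sum_{l=0}^{L_t-1}\alpha^{l}\mathbf{R}_{n,l}\right\|_{\mathrm{Tr}} + \frac{1}{n_d}\left\|\frac{\alpha^{L_t}-\alpha^{L+1}}{1-\alpha}\,\mathbf{I}_{n_d}\right\|_{\mathrm{Tr}} + \frac{1}{n_d}\left\|\sum_{l=L_t}^{L}\alpha^{l}\mathbf{R}_{n,l}\right\|_{\mathrm{Tr}} \\
&< \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon,
\end{aligned}$$

where $\alpha^{l}\mathbf{R}_{n,l} = (\alpha/k)^{l}\,\mathbf{H}_1\cdots\mathbf{H}_l\mathbf{H}_l^H\cdots\mathbf{H}_1^H$ for
$l \geq 1$. Here, we decomposed the terms in a way that allowed us to obtain a sum of precisely the expressions we
had proven to converge to zero before. Besides standard steps we used the fact that
$|\alpha^{L_t} - \alpha^{L+1}| < |\alpha^{L_t}|$, since $L > L_t$ and $\alpha < 1$. By Lemma 2 we have established that
the LSM of $\mathbf{R}_n$ converges to

$$f_{\mathbf{R}_n}^{(\infty)}(x) = \delta\left(x - \frac{1-\alpha^{L+1}}{1-\alpha}\right). \tag{22}$$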



Note that the fact that almost all eigenvalues of the single $\mathbf{R}_{n,l}$'s are arbitrarily close to one each, does
not immediately imply that this is also the case for a weighted sum of these matrices. This is due to the
fact that for the matrix series $\mathbf{R}_{n,l}$, where $l \in \{1, \ldots, L_t\}$, the identity
$\lambda_k\{\sum_{l=0}^{L_t}\mathbf{R}_{n,l}\} = \sum_{l=0}^{L_t}\lambda_k\{\mathbf{R}_{n,l}\}$ is fulfilled, if and only if
all these eigenvalues are exactly equal to one. This easy attempt of proving (22) must therefore fail. Also note that
the $\mathbf{R}_{n,l}$'s are not asymptotically free, which prohibits arguing based on the respective R-transforms [13].

Since the eigenvalues of the corresponding inverse are the inverse eigenvalues, i.e.,
$\lambda_k\{\mathbf{R}_n^{-1}\} = \lambda_k^{-1}\{\mathbf{R}_n\}$, we conclude that the LSM of $\mathbf{R}_n^{-1}$ is given by
$f_{\mathbf{R}_n^{-1}}^{(\infty)}(x) = \delta(x - (1-\alpha)/(1-\alpha^{L+1}))$. Thus, by Lemma 3 (refer to Appendix) and
the respective variable transformation applied to $f_{\mathbf{R}_n^{-1}}^{(\infty)}(\cdot)$ the EED of
$\mathbf{C} = \mathrm{snr}\cdot\mathbf{R}_s\mathbf{R}_n^{-1}$ coincides with the EED of $\mathrm{snr}\cdot\mathbf{R}_s$, i.e.,

$$f_{\mathbf{C}}^{(\infty)}(x) = \frac{1}{\mathrm{snr}}\,f_{\mathbf{R}_s}^{(\infty)}\!\left(\frac{x}{\mathrm{snr}}\right) \quad\text{and}\quad G_{\mathbf{C}}^{(\infty)}(s) = \frac{1}{\mathrm{snr}}\,G_{\mathbf{R}_s}^{(\infty)}\!\left(\frac{s}{\mathrm{snr}}\right).$$

By Corollary 1 the LSM of $\mathbf{R}_s$ is given by the Marcenko-Pastur law, in the case that
$L \in O(\beta_r^{1-\varepsilon})$, which establishes the first case. In the case $L \in \Theta(\beta_r)$ a non-zero
fraction of the eigenvalues of $\mathbf{R}_s$ remains non-zero as $L \to \infty$ by Corollary 1, i.e., the limiting
Stieltjes transforms of $\mathbf{R}_s$ and, after the above rescaling, of $\mathbf{C}$ differ from $s^{-1}$.

Case $L \in \Omega(\beta_r^{1+\varepsilon})$: This case follows immediately by Corollary 1. Since asymptotically almost all
eigenvalues of $\mathbf{R}_s$ vanish, also almost all eigenvalues of $\mathbf{C} = \mathbf{R}_s\mathbf{R}_n^{-1}$ need to
approach zero. We rely on the reader's intuition here, that noise cannot recover degrees of freedom. A formal proof goes
along the lines of the proof of Lemma 3, where $\mathbf{A}$ is identified with $\mathbf{R}_n^{-1}$ and $\mathbf{B}$ with
$\mathbf{R}_s$. In the end one can show that the Shannon transforms of the LSMs $f_{\mathbf{R}_s\mathbf{R}_n^{-1}}(\cdot)$
and $f_{\mathbf{R}_s}(\cdot)$ coincide at a Dirac delta at zero in the limit $L \to \infty$ and $\beta_r \to \infty$. ■

VIII. Simulation Results 

The above results say nothing about the speed of convergence. Since speed-of-convergence results are generally hard to obtain in a large-matrix-dimension analysis, we resort to a numerical demonstration for this purpose. In doing so, we specify the distribution of the elements of $\mathbf{H}_1, \ldots, \mathbf{H}_{L+1}$ as circularly symmetric complex Gaussian with zero mean and unit variance. Recall that all assumptions imposed on this distribution in Theorem 2 were related to its first and second moments only. We fix the number of source and destination antennas to $n = n_s = n_d = 10$ and plot the normalized ergodic capacity $C_0 = (L + 1) \cdot C/n$, as obtained through Monte Carlo simulations, versus the number of relay clusters, $L$, for an average SNR of 10 dB at the destination. The number of relays per cluster evolves with the number of clusters according to $k = L^{\gamma}$, where we vary $\gamma$ between 0 and 3. In principle, one could use the recursive formula for the Stieltjes transform of $\mathbf{C}$ obtained in [3] rather than Monte Carlo simulations for generating these plots. However, the respective evaluations are practical for small $L$ only.
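A Monte Carlo experiment of this kind can be sketched as follows. This is an illustrative sketch under stated assumptions, not the simulation code used for the figures: the helper names `cgauss` and `normalized_capacity`, the per-hop gain normalization, the noise variance of $1/(L+1)$ per stage (chosen so that the average SNR at the destination equals the target), and the capacity expression $C = (L+1)^{-1}\,\mathrm{E}[\log_2\det(\mathbf{I} + \mathrm{snr}\cdot\mathbf{R}_n^{-1}\mathbf{R}_s)]$ are all assumptions.

```python
import numpy as np

def cgauss(rows, cols, rng):
    """Matrix of i.i.d. circularly symmetric complex Gaussians, zero mean, unit variance."""
    return (rng.standard_normal((rows, cols))
            + 1j * rng.standard_normal((rows, cols))) / np.sqrt(2.0)

def normalized_capacity(n, L, gamma, snr_db=10.0, trials=100, seed=0):
    """Monte Carlo estimate of C_0 = (L + 1) * C / n in bits per channel use.

    Assumed model (not taken from the paper): every stage adds white noise of
    variance 1 / (L + 1), every hop is scaled by 1 / sqrt(#transmit antennas),
    and C = (L + 1)^{-1} * E[log2 det(I + snr * R_n^{-1} R_s)].
    """
    rng = np.random.default_rng(seed)
    snr = 10.0 ** (snr_db / 10.0)
    k = max(1, int(round(L ** gamma)))            # relays per cluster, k = L^gamma
    dims = [n] + [k] * L + [n]                    # source, L clusters, destination
    acc = 0.0
    for _ in range(trials):
        G = np.eye(n, dtype=complex)              # signal path up to the current stage
        R_n = np.zeros((n, n), dtype=complex)     # noise covariance at the current stage
        for hop in range(L + 1):
            H = cgauss(dims[hop + 1], dims[hop], rng) / np.sqrt(dims[hop])
            G = H @ G                             # signal is forwarded through this hop
            R_n = H @ R_n @ H.conj().T + np.eye(dims[hop + 1]) / (L + 1)
        R_s = G @ G.conj().T                      # signal covariance at the destination
        M = np.eye(n) + snr * np.linalg.solve(R_n, R_s)
        _, logdet = np.linalg.slogdet(M)          # natural log of |det M|
        acc += logdet / np.log(2.0)
    return acc / (trials * n)                     # the pre-log 1 / (L + 1) cancels in C_0

if __name__ == "__main__":
    for gamma in (0.0, 0.5, 1.0, 2.0):
        print(gamma, normalized_capacity(n=10, L=8, gamma=gamma))
```

Under these assumptions the estimate should grow with $\gamma$ for fixed $L$, which is the qualitative trend described in the text; the absolute values depend on the assumed noise and gain normalization.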

Fig. 2 shows the case of linear and faster scaling of $k$ in $L$. For $\gamma = 1$ the curve flattens out quickly and converges to a constant which is smaller than the normalized point-to-point MIMO capacity, but non-zero, as expected. Furthermore, we observe that the point-to-point limit is approached the sooner the larger $\gamma$ is chosen. For $\gamma = 3$ this is the case after fewer than 10 hops already.

Fig. 3 shows the case of less than linear scaling of $k$ in $L$. While $C$ decreases rather rapidly for constant relay numbers ($\gamma = 0$), we observe that already a moderate growth of $k$ with $L$ slows down the capacity decay significantly. For $\gamma = 0.5$ an almost threefold capacity gain over the $\gamma = 0$ case is achieved for $L = 16$. For $\gamma = 0.75$ the decay is tolerable even for very large $L$. Note that for $L = 81$ and $\gamma = 0.75$ only 27 relays per layer are used, in contrast to the 81 relays needed for linear scaling.
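The relay counts quoted above follow directly from the scaling rule $k = L^{\gamma}$. A small sketch of the arithmetic, where the helper name `relays_per_cluster` and the rounding to the nearest integer are assumptions for illustration:

```python
def relays_per_cluster(L, gamma):
    """Number of relays per cluster under the scaling k = L^gamma, rounded to an integer."""
    return max(1, round(L ** gamma))

L = 81
sublinear = relays_per_cluster(L, 0.75)   # less than linear scaling
linear = relays_per_cluster(L, 1.0)       # linear scaling
print(sublinear, linear)
```

For $L = 81$ this reproduces the 27 and 81 relays per layer mentioned in the text.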

IX. Conclusion 

We have given a criterion for how the number of relays per stage needs to grow with the number of hops in order to sustain a non-zero fraction of the spatial degrees of freedom in a MIMO amplify-and-forward multi-hop network, i.e., linear capacity scaling in $\min\{n_s, n_d\}$. The necessary and sufficient condition is that the number of relays per stage scales at least linearly in the number of hops.
Appendix

Lemma 2. A random matrix $\mathbf{A} \in \mathbb{C}^{n \times n}$ fulfilling $\lim_{n\to\infty} n^{-1}\,\mathrm{Tr}(\mathbf{A}) = 1$ converges to the identity matrix a.s. in the sense that

$$\lim_{n\to\infty} \frac{1}{n}\,\|\mathbf{I}_n - \mathbf{A}\|_{\mathrm{Tr}} = 0,$$

if and only if its EED $F_{\mathbf{A}}^{(n)}(x)$ converges to $\sigma(x - 1)$, i.e., a.s.

$$\lim_{n\to\infty} \sup_x \big|F_{\mathbf{A}}^{(n)}(x) - \sigma(x - 1)\big| = 0.$$
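Proof. The lemma follows from the following chain of identities:

$$
\begin{aligned}
\lim_{n\to\infty} \frac{1}{n}\,\|\mathbf{I}_n - \mathbf{A}\|_{\mathrm{Tr}}
&= \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} \big|1 - \lambda_i\{\mathbf{A}\}\big| \\
&= \lim_{n\to\infty} \frac{1}{n} \sum_{i:\,\lambda_i\{\mathbf{A}\}<1} \big(1 - \lambda_i\{\mathbf{A}\}\big) + \frac{1}{n} \sum_{i:\,\lambda_i\{\mathbf{A}\}>1} \big(\lambda_i\{\mathbf{A}\} - 1\big) \qquad (23) \\
&= \lim_{n\to\infty} \int_0^1 \big|F_{\mathbf{A}}^{(n)}(x)\big|\,dx + \int_1^\infty \big|F_{\mathbf{A}}^{(n)}(x) - 1\big|\,dx \\
&= \lim_{n\to\infty} \int_0^\infty \big|F_{\mathbf{A}}^{(n)}(x) - \sigma(x - 1)\big|\,dx = 0 \qquad (24) \\
\Longleftrightarrow \quad & \lim_{n\to\infty} \sup_x \big|F_{\mathbf{A}}^{(n)}(x) - \sigma(x - 1)\big| = 0. \qquad (25)
\end{aligned}
$$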

In (23) we arrange the individual summands such that they can be related to the EED of $\mathbf{A}$. The equivalence of the norms in (24) and (25) is established as follows: For the forward direction consider $e(x) = |F_{\mathbf{A}}^{(n)}(x) - \sigma(x - 1)|$ for $x \in [0, 1)$, i.e., $e(x) = |F_{\mathbf{A}}^{(n)}(x)|$. Choose any $\Delta \in [-1, 0)$. Since $e(x)$ is monotonically increasing on the interval of interest, we can write

$$\int_{1-|\Delta|}^{1} \big|F_{\mathbf{A}}^{(n)}(x) - \sigma(x - 1)\big|\,dx \geq |\Delta| \cdot e(1 + \Delta).$$

Thus, if $e(1 + \Delta)$ does not go to zero for all $\Delta$, the integral norm cannot go to zero. The same reasoning can be applied for the interval $[1, \infty)$.

For the backward part we break the integration in (24) into two parts. The first integral is from zero to some constant $d > 1$. This part is a Riemann integral over a function that converges uniformly by (25). It goes to zero by taking the limit inside the integral. The second part of the integral is from $d$ to $\infty$. Here, the limit cannot be taken inside the integral in general. However, we can write

$$\lim_{n\to\infty} \int_d^\infty \big(1 - F_{\mathbf{A}}^{(n)}(x)\big)\,dx = \lim_{n\to\infty} \int_0^\infty \big(1 - F_{\mathbf{A}}^{(n)}(x)\big)\,dx - \lim_{n\to\infty} \int_0^d \big(1 - F_{\mathbf{A}}^{(n)}(x)\big)\,dx = 1 - 1 = 0. \qquad (26)$$
The first integral on the right-hand side (RHS) converges to one by the following chain of identities:

$$
\begin{aligned}
\lim_{n\to\infty} \int_0^\infty \big(1 - F_{\mathbf{A}}^{(n)}(x)\big)\,dx
&= \lim_{n\to\infty} \int_0^\infty \Big(1 - \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\{\lambda_i < x\}\Big)\,dx \\
&= \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} \int_0^\infty \big(1 - \mathbf{1}\{\lambda_i < x\}\big)\,dx \\
&= \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} \lambda_i\{\mathbf{A}\} = \lim_{n\to\infty} \frac{1}{n}\,\mathrm{Tr}[\mathbf{A}] = 1.
\end{aligned}
$$

The second term on the RHS of (26) is seen to converge to one by taking the limit inside the integral. Again, this can be done, since we deal with a Riemann integral over a uniformly convergent function. ■
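The identity between the normalized trace norm and the integral norm of the EED, which drives the proof above, holds exactly for every finite spectrum and can be verified numerically. The sketch below is an illustration only; the function names `trace_norm_distance` and `integral_distance` and the Wishart-type test matrices are assumptions and not part of the original proof.

```python
import numpy as np

def trace_norm_distance(eigs):
    """(1/n) * sum_i |1 - lambda_i|, i.e. the normalized trace norm of I_n - A."""
    eigs = np.asarray(eigs, dtype=float)
    return float(np.mean(np.abs(1.0 - eigs)))

def integral_distance(eigs):
    """Exact integral of |F(x) - sigma(x - 1)| for the empirical distribution F of eigs."""
    eigs = np.sort(np.asarray(eigs, dtype=float))
    # F and the unit step are both constant between consecutive grid points.
    grid = np.unique(np.concatenate(([0.0, 1.0], eigs)))
    left, right = grid[:-1], grid[1:]
    F = np.searchsorted(eigs, left, side="right") / eigs.size   # F(x) = #{lambda_i <= x} / n
    step = (left >= 1.0).astype(float)                          # sigma(x - 1)
    return float(np.sum(np.abs(F - step) * (right - left)))

rng = np.random.default_rng(0)
n = 50
for m in (2 * n, 20 * n, 200 * n):
    # Hypothetical example: A = X X^H / m has normalized trace close to one,
    # and its spectrum concentrates around one as m / n grows.
    X = (rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))) / np.sqrt(2.0 * m)
    eigs = np.linalg.eigvalsh(X @ X.conj().T)
    print(m, trace_norm_distance(eigs), integral_distance(eigs))
```

Both columns agree up to rounding for every $m$, and both shrink as the spectrum of the test matrix concentrates around one.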

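Lemma 3. Let $\mathbf{A}, \mathbf{B} \in \mathbb{C}^{n \times n}$ be positive-semidefinite random matrices with LSMs $f_{\mathbf{A}}(x) = \delta(x - 1)$ and $f_{\mathbf{B}}(x) = \psi(x)$, respectively. Then, the LSM of $\mathbf{A}\mathbf{B}$ is given by $f_{\mathbf{A}\mathbf{B}}(x) = \psi(x)$.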

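Proof. We separate the eigenvalues $\mu_i = \lambda_i\{\mathbf{A} - \mathbf{I}_n\}$, $i \in \{1, \ldots, n\}$, into two sets $\mathcal{L}_1$ and $\mathcal{L}_2$. For a fixed $\varepsilon > 0$ the eigenvalues in the first set fulfill $|\mu_i| < \varepsilon$. The second set contains the eigenvalues which fulfill $|\mu_i| \geq \varepsilon$. Firstly, we show that the eigenvalues in $\mathcal{L}_2$ do not have any impact on the LSM of $\mathbf{A}\mathbf{B}$. Since $\mathbf{A} - \mathbf{I}_n$ is Hermitian, we can write with $\tilde{\mathbf{A}} = \mathbf{I}_n + \sum_{i:\,\mu_i \in \mathcal{L}_1} \mu_i \mathbf{v}_i \mathbf{v}_i^H$

$$\mathbf{A}\mathbf{B} = \tilde{\mathbf{A}}\mathbf{B} + \sum_{i:\,\mu_i \in \mathcal{L}_2} \mu_i \mathbf{v}_i \mathbf{v}_i^H \mathbf{B},$$

where $\mathbf{v}_i$ denotes the eigenvector corresponding to $\mu_i$. The EED of $\mathbf{A}$ a.s. converges to $\sigma(x - 1)$. Therefore, the number of eigenvalues in $\mathcal{L}_2$ grows less than linearly in $n$. Since the $\mathbf{v}_i \mathbf{v}_i^H$'s are unit-rank matrices, we conclude that the fraction of differing eigenvalues of $\mathbf{A}\mathbf{B}$ and $\tilde{\mathbf{A}}\mathbf{B}$ goes to zero as $n \to \infty$. Thus, they also share the same LSM.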

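Secondly, we show that the LSM of $\tilde{\mathbf{A}}^{-1} + \rho\mathbf{B}$ is given by

$$f_{\tilde{\mathbf{A}}^{-1} + \rho\mathbf{B}}(x) = \rho^{-1}\,\psi\big(\rho^{-1}(x - 1)\big). \qquad (27)$$

Note that the eigenvalues of $\tilde{\mathbf{A}}^{-1}$ are the inverse eigenvalues of $\tilde{\mathbf{A}}$. Therefore, we can write $\tilde{\mathbf{A}}^{-1} = \mathbf{I}_n + \boldsymbol{\Delta}$, where a.s. for any $\delta > 0$ there exists an $n_0$ such that for all $n > n_0$

$$\max_{i \in \{1, \ldots, n\}} \big|\lambda_i\{\boldsymbol{\Delta}\}\big| < \delta. \qquad (28)$$

Let us denote the normalized eigenvectors of $\tilde{\mathbf{A}}^{-1} + \rho\mathbf{B}$ corresponding to $\lambda_i\{\tilde{\mathbf{A}}^{-1} + \rho\mathbf{B}\}$ by $\mathbf{u}_i$. By the definition of an eigenvector, we can write

$$\big(\mathbf{I}_n + \boldsymbol{\Delta} + \rho\mathbf{B} - \lambda_i\{\tilde{\mathbf{A}}^{-1} + \rho\mathbf{B}\} \cdot \mathbf{I}_n\big) \cdot \mathbf{u}_i = \mathbf{0}$$

for $i \in \{1, \ldots, n\}$. Taking $\boldsymbol{\Delta}\mathbf{u}_i$ to the RHS and taking the Euclidean norm yields

$$\big\|\rho\mathbf{B}\mathbf{u}_i + \big(1 - \lambda_i\{\tilde{\mathbf{A}}^{-1} + \rho\mathbf{B}\}\big)\mathbf{u}_i\big\| = \|\boldsymbol{\Delta}\mathbf{u}_i\|. \qquad (29)$$

By (28) and $\|\mathbf{u}_i\| = 1$, we conclude that for all $n > n_0$ a.s. also

$$\|\boldsymbol{\Delta}\mathbf{u}_i\| \leq \max_{i \in \{1, \ldots, n\}} \big|\lambda_i\{\boldsymbol{\Delta}\}\big| < \delta. \qquad (30)$$

Thus, for all $i$ the RHS of (29) goes to zero as $n \to \infty$. Accordingly, we conclude for the LHS that $\mathbf{u}_i \to \mathbf{w}_i$ and $\lambda_i\{\tilde{\mathbf{A}}^{-1} + \rho\mathbf{B}\} \to 1 + \rho\nu_i$, where $\nu_i$ and $\mathbf{w}_i$ are the $i$th eigenvalue and eigenvector of $\mathbf{B}$. The respective variable transformation in the LSM of $\mathbf{B}$ yields (27).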

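We complete the proof by showing that the Shannon transforms [14], [15] of $f_{\tilde{\mathbf{A}}\mathbf{B}}(\cdot)$ and $f_{\mathbf{B}}(\cdot)$ coincide. Note that the Shannon transform contains the full information about the corresponding distribution. Consider the quantity $\xi = n^{-1} \log\det\big(\tilde{\mathbf{A}}^{-1} + \rho\mathbf{B}\big) - n^{-1} \log\det\big(\tilde{\mathbf{A}}^{-1}\big)$. As $n \to \infty$ this quantity converges to the Shannon transform of $f_{\mathbf{B}}(\cdot)$, $\Upsilon_{\mathbf{B}}(\rho)$, a.s.:

$$
\begin{aligned}
\lim_{n\to\infty} \xi
&= \int_0^\infty \log x \cdot f_{\tilde{\mathbf{A}}^{-1} + \rho\mathbf{B}}(x)\,dx - \int_0^\infty \log x \cdot f_{\tilde{\mathbf{A}}^{-1}}(x)\,dx \\
&= \int_0^\infty \log(1 + \rho x) \cdot f_{\mathbf{B}}(x)\,dx - \int_0^\infty \log x \cdot \delta(x - 1)\,dx \\
&= \int_0^\infty \log(1 + \rho x)\, f_{\mathbf{B}}(x)\,dx = \Upsilon_{\mathbf{B}}(\rho).
\end{aligned}
$$

Rewriting $\xi = n^{-1} \log\det\big(\mathbf{I}_n + \rho\tilde{\mathbf{A}}\mathbf{B}\big)$, we see that $\xi$ also converges to the Shannon transform of $f_{\tilde{\mathbf{A}}\mathbf{B}}(\cdot)$, $\Upsilon_{\tilde{\mathbf{A}}\mathbf{B}}(\rho)$, a.s.:

$$\lim_{n\to\infty} \xi = \int_0^\infty \log x\, f_{\mathbf{I}_n + \rho\tilde{\mathbf{A}}\mathbf{B}}(x)\,dx = \int_0^\infty \log(1 + \rho x)\, f_{\tilde{\mathbf{A}}\mathbf{B}}(x)\,dx = \Upsilon_{\tilde{\mathbf{A}}\mathbf{B}}(\rho).$$

Accordingly, we conclude that $f_{\mathbf{A}\mathbf{B}}(x) = f_{\tilde{\mathbf{A}}\mathbf{B}}(x) = f_{\mathbf{B}}(x) = \psi(x)$. ■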
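The last step of the proof rests on the determinant identity $\log\det(\mathbf{A}^{-1} + \rho\mathbf{B}) - \log\det(\mathbf{A}^{-1}) = \log\det(\mathbf{I}_n + \rho\mathbf{A}\mathbf{B})$, which holds for any invertible $\mathbf{A}$. The sketch below checks it numerically and also compares the finite-$n$ Shannon transforms of $\mathbf{A}\mathbf{B}$ and $\mathbf{B}$ when the spectrum of $\mathbf{A}$ is concentrated around one. The helper names `cgauss` and `shannon_transform`, the Wishart-type choices for $\mathbf{A}$ and $\mathbf{B}$, and the values of $n$ and $\rho$ are assumptions for illustration.

```python
import numpy as np

def cgauss(rows, cols, rng):
    """Matrix of i.i.d. circularly symmetric complex Gaussians, zero mean, unit variance."""
    return (rng.standard_normal((rows, cols))
            + 1j * rng.standard_normal((rows, cols))) / np.sqrt(2.0)

def shannon_transform(M, rho):
    """Finite-n Shannon transform n^{-1} * log det(I_n + rho * M), in nats."""
    n = M.shape[0]
    _, logdet = np.linalg.slogdet(np.eye(n) + rho * M)
    return logdet / n

rng = np.random.default_rng(0)
n, rho = 80, 10.0

X = cgauss(n, 100 * n, rng)
A = X @ X.conj().T / (100 * n)      # hypothetical A: spectrum concentrated around one
Y = cgauss(n, n, rng)
B = Y @ Y.conj().T / n              # hypothetical B: positive semidefinite, spread-out spectrum

A_inv = np.linalg.inv(A)
xi = (np.linalg.slogdet(A_inv + rho * B)[1] - np.linalg.slogdet(A_inv)[1]) / n

identity_error = abs(xi - shannon_transform(A @ B, rho))                 # determinant identity
gap = abs(shannon_transform(A @ B, rho) - shannon_transform(B, rho))     # finite-n deviation
print(identity_error, gap)
```

The first number is zero up to rounding. The second is small but non-zero at finite $n$, since the spectrum of this particular $\mathbf{A}$ only approximates a Dirac delta at one.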
Fig. 1. $n_s$ non-cooperating source antennas transmit to a destination terminal with $n_d$ antennas via $L$ clusters of $k$ non-cooperating relay antennas. (Diagram labels: channel matrices $\mathbf{H}_{L+1}, \mathbf{H}_L, \ldots, \mathbf{H}_1$; Cluster $L$, Cluster $L-1$, ..., Cluster 1.)
Fig. 2. Normalized capacity $C_0$ versus the number of relay clusters $L$ as obtained through Monte Carlo simulations. The number of source and destination antennas is $n = 10$. The SNR at the destination is 10 dB. The number of relays per cluster $k$ evolves according to $k = n \cdot L^{\beta}$. The dashed curve shows the normalized point-to-point MIMO capacity as a reference. (Horizontal axis: Number of Clusters $L$.)
Fig. 3. Normalized capacity $C_0$ versus the number of relay clusters $L$ as obtained through Monte Carlo simulations. The number of source and destination antennas is $n = 10$. The SNR at the destination is 10 dB. The number of relays per cluster $k$ evolves according to $k = n \cdot L^{\beta}$. The dashed curve shows the normalized point-to-point MIMO capacity as a reference.



